TAKE AWAY POINTS FROM THIS POST
With ggmap you can use Google Maps within the ggplot2 environment.
Delaunay triangulation is a way to compute at the area and centroid of a strangly shaped polygon, and can be computed in R with the deldir pacakge.
Centroids of cities’ important places tend to be by rivers. Unless there is an ocean involved.
Today’s data is the location of stops in metro systems in European four cities: Paris, Berlin, Barcelona, and Prague. To collected the names of stops from each cities respective Wikipedia article. I also coded if the stop actually was in the city being analyzed or a different top, usually bordering the city.
With my data in place I began to work with it in R to organize it. I used three packages to start off, dplyr, tidyr, and ggmap. The packages dplyr and tidyr have been discussed previously in this blog [ADD LINK], but ggmap is new. With ggmap you can download maps from various sources, including Google Maps, and plot with it in the ggplot2 environment. I first read in my data and then create a new column called “geo_location” by combining the “station” and “location” columns with a “unite()” call. I also use a “separate()” call, the converse of “unite()” to split the “opened” column (which refers to the dat when the stop was opened) into three columns, one for month, day and year. now I get to use my first ggmap call, “mutate_geocode()”. I can feed the call my “geo_location” column from my data frame and it will make two new columns, “lon” and “lat”, find the longitude and latitude of each stop, and add these values to my new columns. Note, I originally tried added the word “Station” at the end of the stop for all stops but this caused problems. Also, the output from Google Maps is not exactly the same as the Google Maps API. I tried to correct errors as much as possible, but I am not an expert on European Metro systems. If you see an erroneous data point from your city feel free to let me know!
library(dplyr)
library(tidyr)
library(ggmap)
# data = read.table("data_metros.txt", header=T, sep="\t") %>%
# unite(geo_location, c(station, location), sep = ", ", remove = FALSE) %>%
# separate(opened, into = c("opened_month", "opened_day", "opened_year"), sep = "/") %>%
# mutate_geocode(geo_location, source = "google")
data = read.table("data_metro_full.txt", header = T, sep="\t")
Let’s take a look our data frame with our new columns. To do this in a prettier way we can use the DT package.
With our data in place we can start making our maps. This brings us to the second ggmaps call, “get_googlemap()”. With this call I can download city specific maps for my four cities. I can set the type of map (terrain, satellite, roadmap, hybrid), how close to zoom in (integers that range from continent to building), the size of my map in pixels, and if I want the map in black and white or color.
paris_map = get_googlemap(center = "Paris", maptype = "roadmap",
zoom = 11, size = c(640, 420), color = "bw")
berlin_map = get_googlemap(center = "Berlin", maptype = "roadmap",
zoom = 10, size = c(640, 420), color = "bw")
barcelona_map = get_googlemap(center = "Barcelona", maptype = "roadmap",
zoom = 11, size = c(640, 420), color = "bw")
prague_map = get_googlemap(center = "Prague", maptype = "roadmap",
zoom = 11, size = c(640, 420), color = "bw")
With our map object saved from Google can now plot our maps and our metro stops on top. Since I’ll be making roughly the same plot each time I wrote a function which you can see below. The main difference from a typical ggplot2 plot is instead of using “ggplot()” to start off the plot you use “ggmap()” and then feed it the map we had saved. From then on it takes the same ggplot2 calls as any other plot. For example, we can use “geom_point()” to plot our metro stops.
city_plot = function(city_name, city_map){
ggmap(city_map, extent = "device") +
geom_point(data = subset(data, city == city_name), aes(x = lon, y = lat),
color = "#0571b0", size = 4)
}
To plot the maps I’ll be using a call from the package purr, “map2()”, which allows me to run a function in batch. First I save two lists, one of my cities and one of my mpas for my cities, then I call “map2()”. Just as my function “city_plot()” takes two arguments, I feel a list of all values for those arguments, “cities” and “maps”, and then the function itself to the “map2()”. This wil run teh “city_plot()” function over each city and map combination. See the maps with metro stops for the four cities below. I’m also
library(purrr)
cities = list("Paris", "Berlin", "Barcelona", "Prague")
maps = list(paris_map, berlin_map, barcelona_map, prague_map)
city.plots = map2(cities, maps, city_plot)
city.plots
## [[1]]
##
## [[2]]
##
## [[3]]
##
## [[4]]
# paris.plot = city_plot("Paris", paris_map)
# paris.plot
#
# berlin.plot = city_plot("Berlin", berlin_map)
# berlin.plot
#
# barcelona.plot = city_plot("Barcelona", barcelona_map)
# barcelona.plot
#
# prague.plot = city_plot("Prague", prague_map)
# prague.plot
With our maps and data points in place let’s compute the triangulations for each city. We do this with the deldir package.
library(deldir)
make_deldir = function(df){
deldir(df$lon, df$lat)
}
data_deldir = data %>%
nest(-city, .key = location_info) %>%
mutate(deldir = map(location_info, make_deldir)) %>%
mutate(delsgs = map(deldir, "delsgs")) %>%
mutate(del.area = map(deldir, "del.area")) %>%
mutate(summary = map(deldir, "summary"))
data_deldir_delsgs = data_deldir %>%
select(city, delsgs) %>%
unnest()
data_deldir_cent = data_deldir %>%
select(city, summary) %>%
unnest() %>%
group_by(city) %>%
summarise(cent_x = sum(x * del.wts),
cent_y = sum(y * del.wts)) %>%
ungroup()
Now we can update our figures with the triangulations. I’ve again made a function to run one the four maps. First we take our preivously made map with the data points and then we can use “geom_segment()” to draw the edges of the triangles on top. See the four updated maps below.
del_plot = function(city_name, city_map){
ggmap(city_map, extent = "device") +
geom_segment(data = subset(data_deldir_delsgs, city == city_name), aes(x = x1, y = y1, xend = x2, yend = y2),
size = 1.5, color= "#92c5de") +
geom_point(data = subset(data, city == city_name), aes(x = lon, y = lat),
color = "#0571b0", size = 4) +
geom_point(data = subset(data_deldir_cent, city == city_name),
aes(x = cent_x, y = cent_y),
size = 8, color= "#ca0020")
}
## [[1]]
##
## [[2]]
##
## [[3]]
##
## [[4]]
Let’s compare the areas. The area for Paris is [ADD], Berlin [ADD], Barelona [ADD], and Prague [ADD].